Dereverberation of autoregressive envelopes for far-field speech recognition

Abstract

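The task of speech recognition in far-field environments is adversely affected by reverberant artifacts that manifest as temporal smearing of the sub-band envelopes. In this paper, we develop a neural model for speech dereverberation using the long-term sub-band envelopes of speech. The sub-band envelopes are derived using frequency domain linear prediction (FDLP), which performs an autoregressive estimation of the Hilbert envelopes. The neural dereverberation model estimates the envelope gain which, when applied to reverberant signals, suppresses the late reflection components in the far-field signal. The dereverberated envelopes are used for feature extraction in speech recognition. Further, the sequence of steps involved in envelope dereverberation, feature extraction, and acoustic modeling for ASR can be implemented as a single neural processing pipeline, which allows joint learning of the dereverberation network and the acoustic model. Several experiments are performed on the REVERB challenge dataset, the CHiME-3 dataset, and the VOiCES dataset. In these experiments, joint learning of the envelope dereverberation and the acoustic model yields significant performance improvements over the baseline ASR system based on the log-mel spectrogram, as well as over other past approaches for dereverberation (average relative improvements of 10–24% over the baseline system). A detailed analysis of the choice of hyper-parameters and the cost function involved in envelope dereverberation is also provided.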
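As a rough illustration of the FDLP step mentioned in the abstract, the sketch below fits an autoregressive model to the DCT of a signal and evaluates the resulting all-pole response as a smooth estimate of the squared Hilbert envelope. It is a minimal full-band sketch under stated assumptions, not the paper's implementation: the function name `fdlp_envelope`, the default model order, and the regularisation constant are illustrative choices, and the sub-band windowing of the DCT and the neural gain model are omitted.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz


def fdlp_envelope(x, order=20, n_points=None):
    """Approximate the squared Hilbert envelope of x with an AR model.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    x = np.asarray(x, dtype=float)
    n_points = len(x) if n_points is None else n_points

    # Linear prediction is applied in the DCT (frequency) domain.
    c = dct(x, type=2, norm="ortho")

    # Biased autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: len(c) - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # small regularisation for numerical stability

    # Solve the Toeplitz normal equations for the predictor coefficients.
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    err = r[0] - np.dot(a, r[1 : order + 1])  # prediction error power

    # Evaluate err / |A(e^{jw})|^2 on w in [0, pi); w maps linearly to time.
    poly = np.concatenate(([1.0], -a))
    w = np.pi * np.arange(n_points) / n_points
    A = np.exp(-1j * np.outer(w, np.arange(order + 1))) @ poly
    return err / np.abs(A) ** 2
```

Calling `fdlp_envelope(x, order=20)` on a short signal returns one envelope value per input sample. A higher `order` follows finer temporal detail, at the cost of a less smooth estimate.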

Related articles

Adaptive Multichannel Dereverberation for Automatic Speech Recognition

Reverberation is known to degrade the performance of automatic speech recognition (ASR) systems dramatically in far-field conditions. Adopting the weighted prediction error (WPE) approach, we formulate an online dereverberation algorithm for a multi-microphone array. The key contributions of this paper are: (a) we demonstrate that dereverberation using WPE improves performance even when the acou...

Coherence-based Dereverberation for Automatic Speech Recognition

The idea of performing dereverberation using a short-time spatial coherence estimate dates back to 1977 [1], when it was proposed to essentially use the magnitude of the coherence as gain for reverberation suppression. Another heuristic method was recently proposed in [2], where a soft threshold function is used to compute a gain from the coherence magnitude, and the parameters of the threshold...

Tracking and Far-Field Speech Recognition for Multiple Simultaneous Speakers

In prior work, we developed a speaker tracking system based on an extended Kalman filter using time delays of arrival (TDOAs) as acoustic features. While this system functioned well, its utility was limited to scenarios in which a single speaker was to be tracked. In this work, we remove this restriction by generalizing the IEKF, first to a probabilistic data association filter, which incorpora...

Feature mapping using far-field microphones for distant speech recognition

Acoustic modeling based on deep architectures has recently gained remarkable success, with substantial improvement of speech recognition accuracy in several automatic speech recognition (ASR) tasks. For distant speech recognition, the multi-channel deep neural network based approaches rely on the powerful modeling capability of deep neural network (DNN) to learn suitable representation of dista...

Hilbert Envelope Based Features for Far-Field Speech Recognition

Automatic speech recognition (ASR) systems, trained on speech signals from close-talking microphones, generally fail in recognizing far-field speech. In this paper, we present a Hilbert Envelope based feature extraction technique to alleviate the artifacts introduced by room reverberations. The proposed technique is based on modeling temporal envelopes of the speech signal in narrow sub-bands u...

Journal

Journal title: Computer Speech & Language

Year: 2022

ISSN: 1095-8363, 0885-2308

DOI: https://doi.org/10.1016/j.csl.2021.101277